85 research outputs found

    LELIE - An Intelligent Assistant for Improving Requirement Authoring

    Get PDF
    When writing or revising a set of requirements, or any technical document, it is particularly challenging to make sure that texts read easily and are unambiguous for any domain actor. Experience shows that even with several levels of proofreading and validation, most texts still contain a large number of language errors (lexical, grammatical, stylistic, business-related, or with respect to authoring recommendations) and lack overall cohesion and coherence. LELIE has been designed to track these errors and, whenever possible, to suggest corrections. LELIE also has a clear impact on technical writers' behavior: it rapidly becomes an essential and user-friendly authoring companion.
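
    As a rough illustration of the kind of checks such an authoring assistant performs, the sketch below flags a few typical requirement-writing defects (fuzzy terms, a missing requirement modal, passive constructions) with simple pattern matching. The word lists and rules are hypothetical placeholders, not LELIE's actual linguistic resources.

    ```python
    import re

    # Hypothetical word lists and rules; LELIE's real resources are far richer.
    FUZZY_TERMS = {"quickly", "adequate", "user-friendly", "as appropriate"}
    REQUIREMENT_MODAL = re.compile(r"\bshall\b", re.IGNORECASE)

    def check_requirement(text: str) -> list[str]:
        """Return a list of warnings for a single requirement sentence."""
        warnings = []
        lowered = text.lower()
        for term in FUZZY_TERMS:
            if term in lowered:
                warnings.append(f"fuzzy or unverifiable term: '{term}'")
        if not REQUIREMENT_MODAL.search(text):
            warnings.append("no 'shall' found: sentence may not be a requirement")
        if re.search(r"\bshall be \w+ed\b", lowered):
            warnings.append("passive construction: the responsible actor is unclear")
        return warnings

    print(check_requirement("The report shall be generated quickly."))
    ```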

    Korean Parsing Based on the Applicative Combinatory Categorial Grammar

    Get PDF
    PACLIC / The University of the Philippines Visayas Cebu College Cebu City, Philippines / November 20-22, 200

    Discourse structure analysis for requirement mining

    Get PDF
    In this work, we first introduce two main approaches to writing requirements and then propose a method based on Natural Language Processing to improve requirement authoring and the overall coherence, cohesion, and organization of requirement documents. We investigate the structure of requirement kernels, and then the discourse structure associated with those kernels. This enables the system to accurately extract requirements and their related contexts from texts (a task called requirement mining). Finally, we report a first experiment on requirement mining over texts from seven companies, and conclude with an evaluation that compares these results against manually annotated document corpora.
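
    The sketch below gives a deliberately simplified picture of requirement mining: sentences carrying a requirement kernel (crudely detected here through a modal marker) are extracted together with a neighbouring sentence that carries a contextual discourse cue. The marker and cue lists are illustrative assumptions, not the discourse model used in the paper.

    ```python
    # Illustrative marker lists; the paper relies on a proper discourse analysis.
    KERNEL_MARKERS = {"shall", "must", "should"}
    CONTEXT_CUES = ("in order to", "if ", "when ", "unless", "so that")

    def mine_requirements(sentences: list[str]) -> list[dict]:
        mined = []
        for i, sent in enumerate(sentences):
            words = set(sent.lower().split())
            if words & KERNEL_MARKERS:                      # requirement kernel found
                context = [s for s in sentences[max(0, i - 1):i]
                           if any(cue in s.lower() for cue in CONTEXT_CUES)]
                mined.append({"requirement": sent, "context": context})
        return mined

    doc = ["When the engine is running, the oil pressure is monitored.",
           "The controller shall raise an alarm if pressure drops below 0.5 bar."]
    print(mine_requirements(doc))
    ```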

    Identification of fuzzy and underspecified terms in technical documents: an experiment with distributional semantics

    Get PDF
    This study takes place in the framework of the development of the linguistic resources used by an automatic verification system for technical documents such as specifications. Our objective is to semi-automatically enlarge the classes of intrinsically fuzzy terms, along with generic terms, in order to improve the system's detection of ambiguous passages identified as risk factors. We measure and compare the efficiency of automatic distributional analysis methods by comparing the results obtained on corpora of varying sizes and degrees of specialization, starting from a reduced list of seed terms. We show that while a corpus of too limited a size is unusable, its automatic extension with similar documents produces results that complement those obtained by distributional analysis on large generic corpora.
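
    A minimal sketch of the seed-based expansion described above, using gensim's Word2Vec as a stand-in for the distributional analysis tools actually used; the toy corpus, the parameters, and the seed list are placeholders for real specification corpora and validated term lists.

    ```python
    from gensim.models import Word2Vec

    # Toy tokenized corpus; in practice a large (possibly automatically extended)
    # technical corpus or a large generic corpus would be used.
    corpus = [
        ["the", "response", "time", "shall", "be", "adequate"],
        ["the", "delay", "shall", "be", "sufficient"],
        ["the", "interface", "shall", "be", "appropriate", "and", "adequate"],
    ]

    model = Word2Vec(sentences=corpus, vector_size=50, window=3, min_count=1, seed=0)

    seed_terms = ["adequate"]            # reduced list of seed ("prime") terms
    expanded = set(seed_terms)
    for term in seed_terms:
        if term in model.wv:
            # Candidate fuzzy terms are the distributional neighbours of the seeds.
            expanded.update(word for word, _ in model.wv.most_similar(term, topn=3))

    print(sorted(expanded))              # candidates to be validated manually
    ```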

    Identification of fuzzy and generic terms in technical documentation: an experiment with automatic distributional analysis

    Get PDF
    This study takes place in the framework of the development of the linguistic resources used by an automatic verification system for technical documents such as specifications. Our objective is to semi-automatically enlarge the classes of intrinsically fuzzy terms, along with generic terms, in order to improve the system's detection of ambiguous passages identified as risk factors. We measure and compare the efficiency of automatic distributional analysis methods by comparing the results obtained on corpora of varying sizes and degrees of specialization, starting from a reduced list of seed terms. We show that while a corpus of too limited a size is unusable, its automatic extension with similar documents produces results that complement those obtained by distributional analysis on large generic corpora.

    Large Language Models are Few-shot Testers: Exploring LLM-based General Bug Reproduction

    Full text link
    Many automated test generation techniques have been developed to aid developers with writing tests. To facilitate full automation, most existing techniques aim to either increase coverage or generate exploratory inputs. However, existing test generation techniques largely fall short of achieving more semantic objectives, such as generating tests to reproduce a given bug report. Reproducing bugs is nonetheless important, as our empirical study shows that the number of tests added in open source repositories due to issues was about 28% of the corresponding project test suite size. Meanwhile, due to the difficulty of transforming the expected program semantics in bug reports into test oracles, existing failure reproduction techniques tend to deal exclusively with program crashes, a small subset of all bug reports. To automate test generation from general bug reports, we propose LIBRO, a framework that uses Large Language Models (LLMs), which have been shown to be capable of performing code-related tasks. Since LLMs themselves cannot execute the target buggy code, we focus on post-processing steps that help us discern when LLMs are effective, and rank the produced tests according to their validity. Our evaluation of LIBRO shows that, on the widely studied Defects4J benchmark, LIBRO can generate failure-reproducing test cases for 33% of all studied cases (251 out of 750), while suggesting a bug reproducing test in first place for 149 bugs. To mitigate data contamination, we also evaluate LIBRO against 31 bug reports submitted after the collection of the LLM training data terminated: LIBRO produces bug reproducing tests for 32% of the studied bug reports. Overall, our results show LIBRO has the potential to significantly enhance developer efficiency by automatically generating tests from bug reports. Accepted to the IEEE/ACM International Conference on Software Engineering 2023 (ICSE 2023).
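
    The snippet below sketches only one part of the post-processing described above: after several candidate tests have been sampled from an LLM for a bug report, syntactically identical samples are clustered and larger clusters are ranked first. This cluster-size heuristic is a simplification of LIBRO's actual post-processing and validity ranking, and the sample tests are made-up placeholders.

    ```python
    from collections import Counter

    def rank_candidate_tests(candidate_tests: list[str]) -> list[tuple[str, int]]:
        """Cluster LLM-sampled tests after whitespace normalisation and rank
        clusters by size: tests the model produces repeatedly come first."""
        normalised = Counter(" ".join(test.split()) for test in candidate_tests)
        return normalised.most_common()

    # Made-up samples standing in for LLM outputs for a single bug report.
    samples = [
        '@Test public void testIssue() { assertEquals(1, parse("1")); }',
        '@Test public void testIssue() { assertEquals(1,  parse("1")); }',
        '@Test public void testOther() { assertTrue(check()); }',
    ]
    print(rank_candidate_tests(samples))
    ```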

    Towards Autonomous Testing Agents via Conversational Large Language Models

    Full text link
    Software testing is an important part of the development cycle, yet it requires specialized expertise and substantial developer effort to adequately test software. Recent discoveries about the capabilities of large language models (LLMs) suggest that they can be used as automated testing assistants, and thus provide helpful information and even drive the testing process. To highlight the potential of this technology, we present a taxonomy of LLM-based testing agents based on their level of autonomy, and describe how a greater level of autonomy can benefit developers in practice. An example use of LLMs as a testing assistant is provided to demonstrate how a conversational framework for testing can help developers. This also highlights how the often-criticized hallucination of LLMs can be beneficial during testing. We identify other tangible benefits that LLM-driven testing agents can bestow, and also discuss potential limitations.
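
    As a toy illustration of what a low-autonomy, conversational testing assistant could look like, the loop below keeps the full message history so that follow-up questions about earlier test suggestions stay in context. The llm_reply function is a placeholder for a real LLM client, not an API from the paper.

    ```python
    def llm_reply(messages: list[dict]) -> str:
        """Placeholder; replace with a call to an actual LLM chat API."""
        raise NotImplementedError

    def testing_assistant_session(code_under_test: str) -> None:
        messages = [{"role": "system",
                     "content": "You help write and critique unit tests for:\n" + code_under_test}]
        while True:
            user_turn = input("developer> ")
            if user_turn.strip().lower() in {"quit", "exit"}:
                return
            messages.append({"role": "user", "content": user_turn})
            reply = llm_reply(messages)      # e.g. proposes a test or explains a failure
            messages.append({"role": "assistant", "content": reply})
            print("assistant>", reply)
    ```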

    The GitHub Recent Bugs Dataset for Evaluating LLM-based Debugging Applications

    Full text link
    Large Language Models (LLMs) have demonstrated strong natural language processing and code synthesis capabilities, which has led to their rapid adoption in software engineering applications. However, details about LLM training data are often not made public, which has caused concern as to whether existing bug benchmarks are included. Since the training data of the popular GPT models is not available, we examine the training data of the open-source LLM StarCoder, and find it likely that data from the widely used Defects4J benchmark was included, raising the possibility of its inclusion in GPT training data as well. This makes it difficult to tell how well LLM-based results on Defects4J would generalize, as it would be unclear whether a technique's performance is due to LLM generalization or memorization. To remedy this issue and facilitate continued research on LLM-based SE, we present the GitHub Recent Bugs (GHRB) dataset, which includes 76 real-world Java bugs that were gathered after the OpenAI data cut-off point.

    A clustering approach for detecting defects in technical documents

    Get PDF
    Requirements are usually “hand-written” and suffer from several problems such as redundancy and inconsistency. Redundancy and inconsistency between requirements, or between sets of requirements, negatively impact the success of the final product. Handling these issues manually is time-consuming and very costly. The main contribution of this paper is the use of the k-means algorithm for redundancy and inconsistency detection in a new context, namely Requirements Engineering. We also introduce a pre-processing step based on Natural Language Processing (NLP) techniques to assess its impact on the k-means results. We use Part-Of-Speech (POS) tagging and noun chunking to detect the technical business terms associated with the requirements documents that we analyze. We evaluate this approach on real industrial datasets. The results show the effectiveness of the k-means clustering algorithm, especially when combined with the pre-processing step.
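
    A rough sketch of the pipeline described above, assuming spaCy for the POS/noun-chunk pre-processing and scikit-learn for TF-IDF vectorisation and k-means; the toy requirements, the number of clusters, and the specific libraries are illustrative choices, not the paper's exact setup.

    ```python
    import spacy                                   # requires: python -m spacy download en_core_web_sm
    from sklearn.cluster import KMeans
    from sklearn.feature_extraction.text import TfidfVectorizer

    requirements = [
        "The pump shall stop when the tank pressure exceeds 5 bar.",
        "The pump must be stopped if the tank pressure is above 5 bar.",
        "The display shall show the current tank pressure in bar.",
    ]

    nlp = spacy.load("en_core_web_sm")

    def preprocess(requirement: str) -> str:
        """Keep noun chunks as a rough approximation of technical business terms."""
        return " ".join(chunk.text.lower() for chunk in nlp(requirement).noun_chunks)

    texts = [preprocess(r) for r in requirements]
    vectors = TfidfVectorizer().fit_transform(texts)
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

    # Requirements falling into the same cluster are candidates for redundancy
    # or inconsistency review.
    for cluster in set(labels):
        group = [requirements[i] for i, label in enumerate(labels) if label == cluster]
        if len(group) > 1:
            print("possible redundancy/inconsistency:", group)
    ```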